refactor: fuse set_world_poses/set_scales into single _compose_fabric_transform by pv-nvidia · Pull Request #5674 · isaac-sim/IsaacLab

pv-nvidia · 2026-05-18T12:52:14Z

Problem

FabricFrameView had duplicated kernel-launch logic in set_world_poses and set_scales, and the initial USD→Fabric sync called both methods sequentially. This meant:

Duplicated code — the same 20-line kernel launch pattern appeared twice with minor variations (positions/orientations vs scales).
Double PrepareForReuse — the initial USD→Fabric sync in _sync_fabric_from_usd_once called set_world_poses then set_scales, each invoking PrepareForReuse. A second non-idempotent PrepareForReuse call could mask a topology-change signal that should have triggered a fabricarray rebuild.

Solution

Extract _compose_fabric_transform(positions=None, orientations=None, scales=None, indices=None) — a single method that composes any subset of transform components into one kernel launch. Components left as None are skipped via empty arrays.

set_world_poses → delegates to _compose_fabric_transform(positions=..., orientations=...)
set_scales → delegates to _compose_fabric_transform(scales=...)
_sync_fabric_from_usd_once → single fused call with all three components

Additional fix

The topology-change invariant guard in _rebuild_fabric_arrays used assert, which is stripped under python -O. Replaced with raise RuntimeError so it's always active.

Tests

All 36 existing Fabric tests pass (+ 2 xfail).

Merge Order

This PR is part of a stacked series for Fabric-accelerated local poses:

feat: add typed service locator to SimulationContext #5672 — service locator ← this depends on
feat: add FabricStageCache service for shared hierarchy handles #5676 — FabricStageCache
feat: add indexed fabric transform kernels for local/world matrix propagation #5675 — indexed fabric kernels
refactor: move Fabric/USD dispatch from FabricFrameView to FrameView factory #5673 — factory dispatch
refactor: fuse set_world_poses/set_scales into single _compose_fabric_transform #5674 — fused compose ← this PR (depends on feat: add typed service locator to SimulationContext #5672)
PR C+D — local poses (depends on all above)

isaaclab-review-bot

🔍 Code Review Summary

This PR successfully refactors FabricFrameView to consolidate duplicate kernel-launch logic into a single _compose_fabric_transform method and addresses a critical assertion bug. The changes are well-structured and improve both correctness and maintainability.

✅ Correctness

No blocking issues found.

The refactoring maintains behavioral equivalence:

_compose_fabric_transform correctly handles None parameters via empty arrays, preserving the kernel's ability to skip unspecified components
The _fabric_usd_sync_done flag is set correctly after writes
The assert → RuntimeError change in _rebuild_fabric_arrays is a valuable fix — assertions are stripped under python -O, which could silently produce wrong poses or out-of-bounds kernel indices in production

Minor observation:

The _compose_fabric_transform method always calls _fabric_hierarchy.update_world_xforms() even when only scales are updated. This matches the original set_scales behavior, so no regression, but worth noting for future optimization.

⚡ Performance

Net positive impact:

Reduced PrepareForReuse calls — The critical fix: _sync_fabric_from_usd_once now invokes PrepareForReuse exactly once (via the fused compose) instead of twice. This prevents potential topology-change signal masking.
Single kernel launch for initial sync — USD→Fabric sync now uses one wp.launch instead of two, reducing kernel launch overhead.

Micro-optimization opportunity (non-blocking):

# Current: creates empty arrays every call
empty3 = wp.zeros((0, 3), dtype=wp.float32, device=self._device)
empty4 = wp.zeros((0, 4), dtype=wp.float32, device=self._device)

These could be cached as class-level constants, but the overhead is negligible for zero-sized arrays.

📝 Maintainability

Excellent improvements:

Code deduplication — ~40 lines of duplicate kernel-launch boilerplate consolidated into one reusable method
Clearer naming — empty3/empty4 communicates intent better than dummy3/dummy4
Good documentation — The _compose_fabric_transform docstring clearly explains the single-PrepareForReuse guarantee and the _fabric_usd_sync_done side effect
Test modernization — wp.to_torch(fab_pos) → fab_pos.torch uses the current ProxyArray API

Changelogs are well-formatted and accurately describe both the behavioral changes and the assert fix.

🔗 Service Locator (Bundled Feature)

The typed service locator in SimulationContext is a clean pattern:

Type-safe via TypeVar
Automatic close() on clear_instance()
Good test coverage for edge cases (re-registration, missing close method, etc.)

One consideration: set_service does not auto-close the replaced service (caller responsibility per docstring). This is documented but could be a subtle footgun. Consider adding an optional auto_close=True parameter in the future if replacement becomes common.

Verdict: Approve ✅

The refactoring achieves its goals cleanly:

Eliminates duplicate code
Fixes a real bug (assert under -O)
Reduces PrepareForReuse call count
Adds useful infrastructure (service locator)

All 36 existing Fabric tests pass per PR description. Ship it! 🚀

…factory FabricFrameView had an internal _use_fabric flag that fell back to UsdFrameView when Fabric was disabled or the device was unsupported. This violated single-responsibility: FabricFrameView pretended to be one class but sometimes behaved as another. Now the FrameView factory handles all dispatch: - PhysX + Fabric enabled + supported device → FabricFrameView - PhysX without Fabric (or unsupported device) → UsdFrameView - Newton → NewtonSiteFrameView FabricFrameView no longer checks _use_fabric or _fabric_supported_devices. It assumes Fabric is available (the factory guarantees this). UsdFrameView is eagerly registered on the factory since it lives in isaaclab (not a backend package), so FactoryBase's dynamic import (isaaclab_{backend}.sim.views) can't discover it.

Extract _compose_fabric_transform() to deduplicate the kernel-launch logic shared by set_world_poses and set_scales. The initial USD->Fabric sync now composes position, orientation, and scale in one call, so PrepareForReuse is invoked exactly once per logical update. Also replace assert with RuntimeError in _rebuild_fabric_arrays so the topology-change guard survives python -O.

isaaclab-review-bot

🔍 Code Review Summary

This PR successfully refactors FabricFrameView to consolidate duplicate kernel-launch logic into a single _compose_fabric_transform method and moves Fabric/USD dispatch to the FrameView factory. The changes improve both correctness and maintainability.

✅ Correctness

No blocking issues found.

The refactoring maintains behavioral equivalence:

_compose_fabric_transform correctly handles None parameters via empty arrays, preserving the kernel's ability to skip unspecified components
The _fabric_usd_sync_done flag is correctly set after writes
The assert → RuntimeError change in _rebuild_fabric_arrays is a valuable fix — assertions are stripped under python -O, which could silently produce wrong poses or out-of-bounds kernel indices in production
Factory dispatch logic in FrameView._get_backend() correctly checks both fabricEnabled setting and device support before routing to FabricFrameView

Minor observation:

The _compose_fabric_transform method always calls _fabric_hierarchy.update_world_xforms() even when only scales are updated. This matches the original set_scales behavior, so no regression, but worth noting for future optimization.

⚡ Performance

Net positive impact:

Reduced PrepareForReuse calls — The critical fix: _sync_fabric_from_usd_once now invokes PrepareForReuse exactly once (via the fused compose) instead of twice. This prevents potential topology-change signal masking.
Single kernel launch for initial sync — USD→Fabric sync now uses one wp.launch instead of two, reducing kernel launch overhead.

Micro-optimization opportunity (non-blocking):

# Current: creates empty arrays every call
empty3 = wp.zeros((0, 3), dtype=wp.float32, device=self._device)
empty4 = wp.zeros((0, 4), dtype=wp.float32, device=self._device)

These could be cached as class-level constants, but the overhead is negligible for zero-sized arrays.

📝 Maintainability

Excellent improvements:

Code deduplication — ~40 lines of duplicate kernel-launch boilerplate consolidated into one reusable method
Single-responsibility — FabricFrameView no longer contains dispatch logic; it assumes Fabric is available (factory guarantees this)
Clearer naming — empty3/empty4 communicates intent better than dummy3/dummy4
Good documentation — The _compose_fabric_transform docstring clearly explains the single-PrepareForReuse guarantee and the _fabric_usd_sync_done side effect
Test modernization — wp.to_torch(fab_pos) → fab_pos.torch uses the current ProxyArray API

Changelogs are well-formatted and accurately describe both the behavioral changes and the assert fix.

🏭 Factory Dispatch (frame_view.py)

The dispatch logic is clean and well-structured:

FrameView._get_backend() checks SimulationContext for physics manager type
PhysX path now explicitly checks fabricEnabled setting and device compatibility
UsdFrameView is eagerly registered since it lives in isaaclab, not a backend package
Good warning message when Fabric is enabled but device is unsupported

One nit (non-blocking): The _FABRIC_SUPPORTED_DEVICES tuple could potentially be shared between frame_view.py and any other code that needs it (currently duplicated between factory and docs), but this is minor.

⚠️ CI Status

The "Check changelog fragments" job is failing. Please verify the changelog fragment format matches project conventions.

Verdict

The refactoring achieves its goals cleanly:

✅ Eliminates duplicate kernel-launch code
✅ Fixes a real bug (assert stripped under -O)
✅ Reduces PrepareForReuse call count in initial sync
✅ Moves dispatch responsibility to the factory (single-responsibility principle)
✅ All 36 existing Fabric tests pass per PR description

Once the changelog CI check passes, this is ready to merge. 🚀

# Description Removes the `cuda:0`-only restriction in `FabricFrameView`. USDRT `SelectPrims` now accepts any CUDA device index, so Fabric acceleration runs on the simulation device (e.g., `cuda:1`) instead of silently falling back to the slower USD path. This unblocks distributed training where each process is pinned to a specific GPU. Changes: - **Drop device allowlist.** Removes `_fabric_supported_devices`, the device guard in `__init__`, and the corresponding assertion in `_initialize_fabric`. Any CUDA device (or CPU) now works. - **Multi-GPU test coverage.** Three `cuda:1`-parameterized tests gated by `ISAACLAB_TEST_MULTI_GPU=1` env var, plus a dedicated CI workflow on the multi-GPU runner that sets it. - **Fix deprecated `wp.to_torch()` calls.** Replaced with `.torch` accessor on ProxyArray (avoids DeprecationWarning). - **TODOs for follow-up PRs.**: - #5673 - #5674 ## Type of change - New feature (non-breaking change which adds functionality) `cuda:0` continues to work exactly as before; `cuda:1`+ now also works instead of silently falling back to USD. No public API surface changed. ## Checklist - [x] I have read and understood the [contribution guidelines](https://isaac-sim.github.io/IsaacLab/main/source/refs/contributing.html) - [x] I have run the [`pre-commit` checks](https://pre-commit.com/) with `./isaaclab.sh --format` - [x] I have made corresponding changes to the documentation - [x] My changes generate no new warnings - [x] I have added tests that prove my fix is effective or that my feature works - [x] I have updated the changelog and the corresponding version in the extension's `config/extension.toml` file - [x] I have added my name to the `CONTRIBUTORS.md` or my name already exists there > Note: this PR uses a fragment file at `source/isaaclab_physx/changelog.d/feat-frame-view-enable-mgpu.rst` per the fragment-based changelog system. ## Test plan Three new tests gated by `ISAACLAB_TEST_MULTI_GPU=1` and parameterized with `["cuda:1"]`: - `test_fabric_cuda1_world_pose_roundtrip` — `set_world_poses` → `get_world_poses` returns the same values on a non-primary CUDA device. - `test_fabric_cuda1_no_usd_writeback` — Fabric writes on `cuda:1` do not write back to USD. - `test_fabric_cuda1_scales_roundtrip` — covers the `set_scales` write path on `cuda:1`. A dedicated CI workflow (`test-fabric-multi-gpu.yaml`) runs on the `[self-hosted, linux, x64, gpu, multi-gpu]` runner with `ISAACLAB_TEST_MULTI_GPU=1` set. Pre-flights with `nvidia-smi` and `torch.cuda.device_count()`, fails loudly if the runner has < 2 GPUs. To verify locally on a multi-GPU machine: ```bash ISAACLAB_TEST_MULTI_GPU=1 ./isaaclab.sh -p -m pytest \ source/isaaclab_physx/test/sim/test_views_xform_prim_fabric.py -v ``` To verify the `cuda:0` path is unchanged (multi-GPU tests auto-skip): ```bash ./isaaclab.sh -p -m pytest \ source/isaaclab_physx/test/sim/test_views_xform_prim_fabric.py -v ```

pv-nvidia · 2026-05-22T13:11:30Z

Superseded by the consolidated PR #5728 (pv/fabric-full-stack).

github-actions Bot added the isaac-lab Related to Isaac Lab team label May 18, 2026

pv-nvidia force-pushed the pv/fabric-fused-compose branch from 8b5f661 to 2ca6310 Compare May 18, 2026 13:36

isaaclab-review-bot Bot reviewed May 18, 2026

View reviewed changes

pv-nvidia force-pushed the pv/fabric-fused-compose branch 2 times, most recently from 1b12c6c to aad8a83 Compare May 18, 2026 17:10

pv-nvidia added 2 commits May 20, 2026 13:53

pv-nvidia force-pushed the pv/fabric-fused-compose branch from aad8a83 to b0206ce Compare May 20, 2026 14:11

isaaclab-review-bot Bot reviewed May 20, 2026

View reviewed changes

isaaclab-review-bot Bot mentioned this pull request May 20, 2026

Enable mgpu in FrameView #5514

Merged

7 tasks

pv-nvidia mentioned this pull request May 21, 2026

feat: Full Fabric acceleration stack — local poses, stage cache, fused compose #5728

Draft

pv-nvidia closed this May 22, 2026

pv-nvidia reopened this May 22, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor: fuse set_world_poses/set_scales into single _compose_fabric_transform#5674

refactor: fuse set_world_poses/set_scales into single _compose_fabric_transform#5674
pv-nvidia wants to merge 2 commits into
isaac-sim:developfrom
pv-nvidia:pv/fabric-fused-compose

pv-nvidia commented May 18, 2026 •

edited

Loading

Uh oh!

isaaclab-review-bot Bot left a comment

Uh oh!

isaaclab-review-bot Bot left a comment

Uh oh!

pv-nvidia commented May 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

pv-nvidia commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

Solution

Additional fix

Tests

Merge Order

Uh oh!

isaaclab-review-bot Bot left a comment

Choose a reason for hiding this comment

🔍 Code Review Summary

✅ Correctness

⚡ Performance

📝 Maintainability

🔗 Service Locator (Bundled Feature)

Verdict: Approve ✅

Uh oh!

isaaclab-review-bot Bot left a comment

Choose a reason for hiding this comment

🔍 Code Review Summary

✅ Correctness

⚡ Performance

📝 Maintainability

🏭 Factory Dispatch (frame_view.py)

⚠️ CI Status

Verdict

Uh oh!

pv-nvidia commented May 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

pv-nvidia commented May 18, 2026 •

edited

Loading